
Fix parse_azure_endpoint passing query string to AsyncAzureOpenAI#231

Merged
gvanrossum merged 2 commits into microsoft:main from KRRT7:fix/parse-azure-endpoint
Apr 10, 2026
Conversation

@KRRT7
Contributor

@KRRT7 KRRT7 commented Apr 10, 2026

Stack: 1/4 — merge this first, then #229, #230, #232.


  • parse_azure_endpoint returned the full URL including ?api-version=...
  • AsyncAzureOpenAI appends /openai/ to azure_endpoint, producing a mangled URL with the query string in the path
  • Now strips the query string with str.split("?", 1)[0] before returning
  • Added 6 unit tests covering: basic URL, no version, separate env var, missing env var, empty query string

Benchmark

No performance impact — this is a correctness fix.


Generated by codeflash optimization agent

KRRT7 added 2 commits April 9, 2026 22:45
uv 0.10.x is current; the <0.10.0 constraint caused build warnings.
parse_azure_endpoint returned the raw URL including ?api-version=...
which AsyncAzureOpenAI then mangled into invalid paths like
...?api-version=2024-06-01/openai/. Strip the query string before
returning — api_version is already returned as a separate value and
passed to the SDK independently.
Collaborator

@gvanrossum gvanrossum left a comment


Looks great!

@gvanrossum gvanrossum merged commit 02ca2d2 into microsoft:main Apr 10, 2026
16 checks passed
@bmerkle
Collaborator

bmerkle commented Apr 13, 2026

This breaks the current setup, e.g. `.env` files containing URLs with API version information (which is the default in an Azure Foundry setup).
Please see bug report #238.

bmerkle added a commit to bmerkle/typeagent-py that referenced this pull request Apr 13, 2026
 Updated parse_azure_endpoint in utils.py:200 to strip the /openai/deployments/... path from the endpoint URL. Previously it only removed the query string, leaving the full deployment path — which AsyncAzureOpenAI then duplicated, causing the 404.

 Updated test_online.py to use create_chat_model() → model._model.request() — the same _make_azure_provider → AzureProvider(openai_client=AsyncAzureOpenAI(...)) code path used by the rest of the codebase.
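The follow-up fix goes a step further than stripping the query string: it also drops the `/openai/deployments/...` path, which `AsyncAzureOpenAI` would otherwise duplicate. A minimal sketch of that idea, assuming the endpoint should reduce to scheme plus host (the helper name is made up for illustration):

```python
from urllib.parse import urlparse


def strip_deployment_path(url: str) -> str:
    """Reduce a full Azure deployment URL to the bare endpoint.

    AsyncAzureOpenAI re-appends /openai/deployments/<name>/... itself,
    so a URL that already contains that path would have it duplicated,
    producing the 404 described above.
    """
    # Drop the query string first, then keep only scheme://host.
    parsed = urlparse(url.split("?", 1)[0])
    return f"{parsed.scheme}://{parsed.netloc}"
```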
gvanrossum pushed a commit that referenced this pull request Apr 14, 2026
fixes bugs described in #238 
- regression URL parsing in PR #231
- uv.lock updated to newer versions in many cases
bmerkle added a commit that referenced this pull request Apr 22, 2026
**Stack: 3/4** — depends on #229. Merge #231, #229, then this PR.

---

- Add `add_terms_batch` and `add_properties_batch` to
`ITermToSemanticRefIndex` and `IPropertyToSemanticRefIndex` interfaces
- SQLite backend uses `executemany` instead of individual
`cursor.execute()` calls (~1000+ calls per indexing batch reduced to
2-3)
- Restructure `add_metadata_to_index_from_list` and
`add_to_property_index` to collect all data first (pure functions), then
batch-insert
- Memory backend implements batch methods as loops for interface
compatibility
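The batching pattern above can be illustrated with a minimal sketch. The table and column names here are invented for the example, not the repo's actual schema:

```python
import sqlite3


def add_terms_batch(conn: sqlite3.Connection, rows: list[tuple[str, int]]) -> None:
    # One executemany call replaces len(rows) individual execute() calls,
    # amortizing per-statement overhead across the whole batch.
    conn.executemany(
        "INSERT INTO term_index (term, semref_id) VALUES (?, ?)", rows
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE term_index (term TEXT, semref_id INTEGER)")
add_terms_batch(conn, [("book", 1), ("email", 2), ("book", 3)])
```

The memory backend can implement the same interface method as a plain loop over its single-item `add_term`, which keeps both backends interchangeable.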

## Benchmark

### Azure Standard_D2s_v5 -- 2 vCPU, 8 GiB RAM, Python 3.13

#### Indexing Pipeline (pytest-async-benchmark pedantic, 20 rounds, 3
warmup)

Only the hot path (`add_messages_with_indexing`) is timed -- DB
creation, storage init, and teardown are excluded.

| Benchmark | Before (min) | After (min) | Speedup |
|:---|---:|---:|---:|
| `add_messages_with_indexing` (200 msgs) | 28.8 ms | 25.0 ms | **1.16x** |
| `add_messages_with_indexing` (50 msgs) | 7.8 ms | 6.7 ms | **1.16x** |
| VTT ingest (40 msgs) | 6.9 ms | 6.1 ms | **1.14x** |

Consistent ~14-16% improvement -- `executemany` amortizes per-call
overhead.

<details>
<summary><b>Reproduce the benchmark locally</b></summary>

Save the benchmark file below as
`tests/benchmarks/test_benchmark_indexing.py`, then:

```bash
pip install 'pytest-async-benchmark @ git+https://github.com/KRRT7/pytest-async-benchmark.git@feat/pedantic-mode' pytest-asyncio

# Run on main
git checkout main
python -m pytest tests/benchmarks/test_benchmark_indexing.py -v -s

# Run on this branch
git checkout perf/batch-inserts
python -m pytest tests/benchmarks/test_benchmark_indexing.py -v -s
```
</details>

---

*Generated by codeflash optimization agent*

---------

Co-authored-by: Bernhard Merkle <bernhard.merkle@gmail.com>
bmerkle added a commit that referenced this pull request Apr 22, 2026
**Stack: 4/4** — depends on #230. Merge #231, #229, #230, then this PR.

---

- Five call sites used `get_item()` per scored ref — one SELECT and full
deserialization per match (N+1 pattern)
- Added `get_metadata_multiple` to `ISemanticRefCollection` that fetches
only `semref_id, range_json, knowledge_type` in a single batch query
- Replaced the N+1 loop with one `get_metadata_multiple` call at each
site
- Further optimized scope-filtering: binary search in `contains_range`,
inline tuple comparisons in `TextRange`, skip pydantic validation in
`get_metadata_multiple`
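The batched metadata fetch can be sketched as a single `IN (...)` query replacing the per-ref `SELECT`s. Table and column names follow the description above but are otherwise assumptions, and the dict return shape is invented for the example:

```python
import json
import sqlite3


def get_metadata_multiple(conn, semref_ids):
    # One IN (...) query replaces one SELECT per id (the N+1 pattern),
    # and fetches only the metadata columns, not the full rows.
    placeholders = ",".join("?" * len(semref_ids))
    cur = conn.execute(
        "SELECT semref_id, range_json, knowledge_type FROM semantic_refs "
        f"WHERE semref_id IN ({placeholders})",
        list(semref_ids),
    )
    return {sid: (json.loads(rj), kt) for sid, rj, kt in cur}


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE semantic_refs (semref_id INTEGER PRIMARY KEY, "
    "range_json TEXT, knowledge_type TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO semantic_refs VALUES (?, ?, ?, ?)",
    [(1, '{"start": 0}', "entity", "large payload"),
     (2, '{"start": 5}', "topic", "large payload")],
)
meta = get_metadata_multiple(conn, [1, 2])
```

Callers that only filter by range or knowledge type never deserialize the full record, which is where most of the per-match cost was going.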

### Call sites optimized

1. `lookup_term_filtered` — batch metadata, filter by
knowledge_type/range
2. `lookup_property_in_property_index` — batch metadata, filter by range
scope
3. `SemanticRefAccumulator.group_matches_by_type` — batch metadata,
group by knowledge_type
4. `SemanticRefAccumulator.get_matches_in_scope` — batch metadata,
filter by range scope
5. `get_scored_semantic_refs_from_ordinals_iter` — two-phase: metadata
filter then batch fetch

### Additional optimizations

- **Binary search in `TextRangeCollection.contains_range`**: replaced
O(n) linear scan with `bisect_right` keyed on `start`, reducing
scope-filtering from ~25ms to ~9ms
- **Inline tuple comparisons in `TextRange`**: replaced `TextLocation`
allocations in `__eq__`/`__lt__`/`__contains__` with a shared
`_effective_end` returning tuples
- **Skip pydantic validation in `get_metadata_multiple`**: construct
`TextLocation`/`TextRange` directly from JSON instead of going through
`__pydantic_validator__`
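The `bisect_right` idea above can be sketched with ranges as plain `(start, end)` pairs instead of the repo's `TextRange` objects; assuming the collection is sorted by `start` and non-overlapping, the rightmost range starting at or before the query start is the only containment candidate:

```python
from bisect import bisect_right


def contains_range(sorted_ranges, query):
    """True if some range in sorted_ranges fully contains query.

    sorted_ranges: non-overlapping (start, end) pairs sorted by start.
    Binary search replaces the O(n) linear scan over all ranges.
    """
    qstart, qend = query
    # Index of the rightmost range with start <= qstart.
    i = bisect_right(sorted_ranges, (qstart, float("inf"))) - 1
    return i >= 0 and sorted_ranges[i][1] >= qend
```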

## Benchmark

### Azure Standard_D2s_v5 — 2 vCPU, 8 GiB RAM, Python 3.13

#### Query (pytest-async-benchmark pedantic, 200 rounds)

200 matches against a 200-message indexed SQLite transcript. Only the
function under test is timed.

| Function | Before (median) | After (median) | Speedup |
|:---|---:|---:|---:|
| `lookup_term_filtered` | 2.650 ms | 1.184 ms | **2.24x** |
| `group_matches_by_type` | 2.428 ms | 978 μs | **2.48x** |
| `get_scored_semantic_refs_from_ordinals_iter` | 2.541 ms | 2.946 ms | 0.86x |
| `lookup_property_in_property_index` | 25.306 ms | 9.365 ms | **2.70x** |
| `get_matches_in_scope` | 25.011 ms | 9.160 ms | **2.73x** |

<details>
<summary><b>Reproduce the benchmark locally</b></summary>

```bash
pip install 'pytest-async-benchmark @ git+https://github.com/KRRT7/pytest-async-benchmark.git@feat/pedantic-mode' pytest-asyncio
python -m pytest tests/benchmarks/test_benchmark_query.py -v -s
```
</details>

---

*Generated by codeflash optimization agent*

---------

Co-authored-by: Bernhard Merkle <bernhard.merkle@gmail.com>
bmerkle pushed a commit that referenced this pull request Apr 23, 2026
Fixed regressions caused by #231 that only showed up when using real API keys.
